Jin Xu, Zhejiang University, simplecat123@zju.edu.cn PRIMARY
Shuilin Ren, Zhejiang University, shuilinren@foxmail.com
Yubo Tao, Zhejiang University, taoyubo@cad.edu.cn SUPERVISOR
Hai Lin, Zhejiang University, lin@cad.zju.edu.cn SUPERVISOR
Student Team: YES
Did you use data from both mini-challenges? No
D3
Excel
Oracle database
Approximately how many hours were spent working on this submission in total?
We spent about 220 hours on this submission.
May we post your submission in the Visual Analytics Benchmark Repository after VAST Challenge 2015 is complete? YES
Video:
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Questions
MC2.1 – Identify those IDs that stand out for their large volumes of communication. For each of these IDs
a. Characterize the communication patterns you see.
b. Based on these patterns, what do you hypothesize about these IDs?
Limit your response to no more than 4 images and 300 words.
Fig. 1-1: Volumes of communication on Friday, Saturday, and Sunday
To identify the IDs that stand out for their large volumes of communication, we
loaded the data into an Oracle database and accumulated the communications for
each hour. As shown in Fig. 1-1, the IDs that stand out are 1278894, 839736,
external, and a further group of 24 high-volume IDs.
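The hourly accumulation described above can be sketched outside the database as follows. This is only an illustration, not the Oracle query we ran: the row layout (timestamp, sender, receiver, location), the placeholder IDs A, B, and C, the sample timestamps, and the choice to credit both sender and receiver are all assumptions.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample rows: (timestamp, sender, receiver, location).
# The IDs "A", "B", "C" and all timestamps are made up for illustration.
rows = [
    ("2014-06-06 12:00:05", "A", "B", "Entry Corridor"),
    ("2014-06-06 12:05:05", "A", "C", "Entry Corridor"),
    ("2014-06-06 12:10:41", "B", "external", "Coaster Alley"),
    ("2014-06-06 13:02:10", "C", "B", "Entry Corridor"),
]

def hourly_volume(rows):
    """Count messages per (id, hour), crediting both sender and receiver."""
    counts = Counter()
    for ts, sender, receiver, _location in rows:
        parsed = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
        hour = parsed.replace(minute=0, second=0)  # truncate to the hour
        counts[(sender, hour)] += 1
        counts[(receiver, hour)] += 1
    return counts

def top_ids(rows, n):
    """Return the n IDs with the largest total volume over all hours."""
    totals = Counter()
    for (id_, _hour), count in hourly_volume(rows).items():
        totals[id_] += count
    return [id_ for id_, _count in totals.most_common(n)]
```

Sorting the totals in this way is how a short list of standout IDs can be read off before inspecting each one in detail.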
1. ID 1278894
Observing the communication of 1278894, we see that it sends messages from
12:00 to 21:00. Within that interval, it sends messages every 5 minutes for one
hour and then pauses for one hour, and each time it sends a large volume of
messages at once. Comparing the numbers of sent and received messages across the
three days, we find that 1278894 never sends messages at any other time. Its
communication pattern is therefore regular.
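One simple way to check such regularity is to tabulate the gaps between consecutive distinct send times. The sketch below uses hypothetical send times built to mimic the described pattern (every 5 minutes for an hour, then an hour of silence); it does not use the real records, and the function name is our own.

```python
from collections import Counter
from datetime import datetime, timedelta

def send_gaps(timestamps):
    """Count the gaps (in seconds) between consecutive distinct send times."""
    times = sorted(set(timestamps))
    return Counter(int((b - a).total_seconds()) for a, b in zip(times, times[1:]))

# Hypothetical send times: every 5 minutes from 12:00 to 12:55,
# silence from 13:00 to 13:55, then every 5 minutes again from 14:00.
start = datetime(2014, 6, 6, 12, 0, 0)
sends = [start + timedelta(minutes=5 * k) for k in range(12)]
sends += [start + timedelta(hours=2, minutes=5 * k) for k in range(12)]

gaps = send_gaps(sends)
```

A dominant 300-second gap together with a few much longer gaps is the signature of the send-then-pause schedule.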
As shown in Fig. 1-2, 1278894 sends messages to some, but not all, members of
two groups. We also find that 1278894 stays at Entry Corridor on all three days.
On the Dinofun World map, ID 60 is the location for daily slab maps and
information in the Entry Corridor. We therefore infer that 1278894 is
responsible for announcing daily affairs to visitors around the park and stays
at location ID 60.
2. ID 839736
839736 stays at Entry Corridor, and its sending and receiving patterns are
similar. It receives a large number of messages between 8:00 and 10:00 on Friday
and Saturday. On Sunday, however, it receives a large number of messages from
11:30 to 12:00; this may reflect a vandalism event that visitors were asking
839736 about. We therefore infer that ID 839736 handles visitor inquiries.
Fig. 1-2
3. ID external
The ID external represents communications between people inside the park and
the outside. People inside tend to share information about activities or
unexpected events, so the times when external receives a large number of
messages may indicate a stage show or the vandalism.
Fig. 1-3: External communication on Saturday and Sunday
4. Group ID
From the remaining IDs, we identify the 24 with the largest communication
volumes (IDs: 1116329, 1045021, 1250941, 918738, 128533, 1749109,
1427875, 1388162, 49375, 1300247, 970490, 484248, 1508923, 530908, 810123,
992045, 1280922, 38622, 174974, 171002, 1692925, 856067, 1410699, 74616).
We find that they have similar communication patterns. Querying the database
shows that they always send a large number of messages simultaneously, and the
topological relations among them are the same on all three days. They move
through the park and send or receive messages from 8:30 to 23:00 each day (e.g.,
ID 1116329 in Fig. 1-4). We think they are most likely staff.
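To illustrate what "sending simultaneously" can mean in practice, the sketch below compares the sets of send times of each pair of IDs with a Jaccard similarity. The similarity measure, the 0.8 threshold, and the placeholder IDs and times are assumptions for illustration; the actual check was done with database queries.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity of two collections of send times."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def synchronized_pairs(send_times, threshold=0.8):
    """Return ID pairs whose send times overlap at least `threshold`."""
    return sorted(
        (x, y)
        for x, y in combinations(sorted(send_times), 2)
        if jaccard(send_times[x], send_times[y]) >= threshold
    )

# Hypothetical send times per placeholder ID.
send_times = {
    "A": ["08:30:00", "09:00:00", "10:15:00"],
    "B": ["08:30:00", "09:00:00", "10:15:00"],
    "C": ["11:00:00", "12:40:00"],
}

pairs = synchronized_pairs(send_times)
```

IDs that pair up with many others under such a test, on all three days, would be candidates for a coordinated group such as staff.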
Fig. 1-4
MC2.2 – Describe up to 10 communications patterns in the data. Characterize who is communicating, with whom, when and where. If you have more than 10 patterns to report, please prioritize those patterns that are most likely to relate to the crime.
Limit your response to no more than 10 images and 1000 words.
According to the plan for Scott’s weekend, there would be two stage shows on
each of Friday and Saturday and a talk by Scott on Sunday. In addition, a
memorabilia show would be on display each day. Therefore, if there were no
vandalism, the visiting patterns on Friday, Saturday, and Sunday should be
similar. By analyzing how the volumes of communication change over time and the
relationships between IDs, all park visitors and staff can be classified. These
features are shown in Fig.2-1.
Fig.2-1
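The clusters discussed in the patterns below are groups of IDs that communicate mainly among themselves. This write-up does not spell out the grouping procedure, so the following is only one plausible sketch: drop the hub IDs (1278894, 839736, external) and take connected components of the remaining who-talks-to-whom graph. The visitor IDs v1 to v5 and the edges are hypothetical.

```python
from collections import defaultdict

# Hub IDs that talk to almost everyone and would merge all clusters.
HUBS = {"1278894", "839736", "external"}

def clusters(edges, hubs=HUBS):
    """Connected components of the communication graph, ignoring hub IDs."""
    graph = defaultdict(set)
    for a, b in edges:
        if a in hubs or b in hubs:
            continue
        graph[a].add(b)
        graph[b].add(a)
    seen, result = set(), []
    for node in sorted(graph):
        if node in seen:
            continue
        stack, component = [node], set()
        while stack:  # depth-first traversal of one component
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(graph[current] - component)
        seen |= component
        result.append(sorted(component))
    return result

# Hypothetical (sender, receiver) pairs.
edges = [
    ("v1", "v2"), ("v2", "v3"), ("1278894", "v1"),
    ("v4", "v5"), ("839736", "v4"),
]

groups = clusters(edges)
```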
Pattern 1: Visitors gathered at Coaster Alley to watch the stage shows on Friday and Saturday.
Typical IDs: 686348, 1821472, 1862455, 1251090
There are 4 such typical clusters on Friday and 11 on Saturday. For example,
the cluster shown in Fig.2-2 has 37 members and 16 leaders, and communicates
only with members of the same cluster, ID 1278894, and ID 839736. It arrived at
Coaster Alley at 14:47:59, stopped communicating at 14:49:03, and started
communicating again at 16:00:02. A stage show probably took place during that
period.
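A silent window like this can be located by finding the largest gap between consecutive messages of a cluster. In the sketch below, only the 14:49:03 and 16:00:02 timestamps come from the example above; the other timestamps and the calendar date are hypothetical.

```python
from datetime import datetime

def longest_silence(timestamps):
    """Return (start, end) of the longest gap between consecutive messages."""
    times = sorted(timestamps)
    if len(times) < 2:
        return None
    return max(zip(times, times[1:]), key=lambda pair: pair[1] - pair[0])

# Message times for one cluster; the date and all but two times are made up.
day = (2014, 6, 6)
messages = [
    datetime(*day, 14, 30, 12),
    datetime(*day, 14, 47, 59),
    datetime(*day, 14, 49, 3),   # last message before the silence
    datetime(*day, 16, 0, 2),    # first message after the silence
    datetime(*day, 16, 4, 40),
]

silence = longest_silence(messages)
```

Comparing the silent windows of several clusters at the same location is what suggests a shared event such as a show.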
Fig.2-2: Pattern 1
Pattern 2: Visitors did not gather at Coaster Alley to watch the talk on Sunday.
Typical IDs: 60390, 187692, 773078, 874532
On Sunday there are no clusters like those in Pattern 1. There is, however, a
typical cluster of visitors with 44 members and 19 leads that communicates only
with members of the same cluster, ID 1278894, and ID 839736. Intending to
attend Scott’s talk as planned, the cluster arrived at Coaster Alley at
14:37:28, but its leads received messages from ID 1278894 at 14:45:00, which
may have informed them that the talk was canceled. They then left Coaster Alley
at once.
Fig.2-3: Pattern 2
Pattern 3: Extra security guards appeared on Sunday only.
Typical IDs: 2047906, 955733, 1038892, 378256
This is the most typical cluster on Sunday. It has 37 members and 19 leads. It
communicates with the members inside the cluster and with two other IDs
(1278894 and 839736), and its communication volume is largest between 11:00 and
12:00. Cluster members entered the park at Entry Corridor at 9:32:38, went
directly to Wet Land, arriving at 9:41:11, and stayed there until 12:06:49.
Creighton Pavilion, where the memorabilia show would be held, is in Wet Land.
The news reported that extra security would be added at the Pavilion to ensure
visitors’ safety. The cluster members are therefore likely to be the extra
security guards. The cluster suddenly sent a lot of messages at 11:32:58, which
indicates that the vandalism had been discovered, and it stayed at Wet Land
until the problem was solved.
Fig.2-4: Pattern 3
Pattern 4: Staff appear on all three days and communicate from morning to night.
Typical IDs: 935776, 714380, 733140, 805298
This pattern concerns staff. In general, staff appear on all three days and
communicate from morning to night, and their communication volumes are
relatively stable. The two clusters shown in Fig.2-5 match this behavior. The
members of each cluster probably belong to the same group because they play the
same roles each day. The two clusters have few links to each other, which is
understandable: they may have different roles in the park while still needing
to exchange some information. Both clusters send messages from every location
in the park, from which we infer that they patrol every corner.
Fig.2-5: Pattern 4
Pattern 5: Suspects by location
IDs: 416790, 1187909, 1502920, 1123214, 1350546, 461004, 1000279
1. The cluster stayed at Wet Land throughout the period on Sunday when the vandalism occurred.
2. They stayed at other locations only very briefly, apparently to avoid suspicion, and did nothing at those locations.
3. They suddenly sent a lot of messages at 12:00, from which we infer that they might have been discussing the outcome.
Fig.2-6: Pattern 5
Pattern 6: Suspects
IDs: 962171, 1558676, 458709, 630410, 912123, 1948458
1. The cluster communicated all day except between 9:31:38 and 11:05:21. A member sent a message at 11:05:21, which shows that they were at Wet Land. When the vandalism was discovered at 11:32, however, they left Wet Land, whereas ordinary visitors would be more inclined to go to the site of the incident than to leave.
2. The members also appeared on Saturday, and their routes were similar on Sunday, when they spent almost all day at Wet Land.
3. The cluster communicated with staff on Saturday but only with each other on Sunday.
Fig.2-7: Pattern 6
Pattern 7: Suspects
IDs: 1279196, 1385263, 1646340, 1892771, 484032
The cluster appeared only in the morning, and during that period it spent most
of its time at Wet Land.
Fig.2-8: Pattern 7
MC2.3 – From this data, can you hypothesize when the crime was discovered? Describe your rationale.
Limit your response to no more than 3 images and 300 words.
We hypothesize that the crime was discovered at 11:32:58.
The entrance to Creighton Pavilion is located in Wet Land, so we focus on the
communications in Wet Land. Looking at statistics over three attributes (time,
location, and number of communications), we see that the communication trend is
similar on Friday and Saturday, but on Sunday the volume between 11:00 and 12:00
is much larger than on the other days (Fig. 3-1).
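The day-to-day comparison can be expressed as a simple rule: flag the hours in which one day's count is well above the average of the other days. The sketch below is a minimal illustration; the hourly counts and the factor of 2 are made-up assumptions, not values read from Fig. 3-1.

```python
def unusual_hours(baseline_days, target_day, ratio=2.0):
    """Return hours where target_day exceeds ratio times the baseline mean."""
    flagged = []
    for hour, count in sorted(target_day.items()):
        baseline = [day.get(hour, 0) for day in baseline_days]
        mean = sum(baseline) / len(baseline)
        if count > ratio * mean:
            flagged.append(hour)
    return flagged

# Hypothetical hourly message counts at one location (hour of day -> count).
friday = {10: 100, 11: 110, 12: 105}
saturday = {10: 120, 11: 130, 12: 115}
sunday = {10: 115, 11: 400, 12: 130}

flagged = unusual_hours([friday, saturday], sunday)
```

With finer time bins, for example one minute instead of one hour, the same rule narrows the estimate of when the unusual activity began.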
Fig. 3-1
Based on the communication volumes, we believe that ID 839736 is a member of
staff who handles inquiries and whom every visitor can ask for help. Looking at
the messages both received and sent by 839736 (Fig. 3-2), we see that the
numbers suddenly become very large. We infer that news of the vandalism spread
around Wet Land and that many people were talking about the event and asking
839736 whether the activity would be held as planned. Analyzing the ID external
(Fig. 3-2), we also find that people at Wet Land suddenly sent a large number of
messages to the outside between 11:30 and 12:00. The vandalism must therefore
have been discovered before this time.
Fig. 3-2
We have also analyzed the cluster we call “extra security guards”. It suddenly
sent a lot of messages at 11:32:58 at Wet Land, which indicates that the
vandalism had been discovered. The cluster stayed at Wet Land until the problem
was solved at 12:06:49.
Fig. 3-3